Scalable robust graph embedding with Spark

Abstract

Graph embedding aims at learning a vector-based representation of vertices that incorporates the structure of the graph. This representation then enables inference of graph properties. Existing graph embedding techniques, however, do not scale well to large graphs. While several techniques that scale graph embedding using compute clusters have been proposed, they require continuous communication between the compute nodes and cannot handle node failure. We therefore propose a framework for scalable and robust graph embedding based on the MapReduce model, which can distribute any existing embedding technique. Our method splits a graph into subgraphs to learn their embeddings in isolation and subsequently reconciles the embedding spaces derived for the subgraphs. We realize this idea through a novel distributed graph decomposition algorithm. In addition, we show how to implement our framework in Spark to enable efficient learning of effective embeddings. Experimental results illustrate that our approach scales well, while largely maintaining the embedding quality.
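The abstract does not say how the embedding spaces of the subgraphs are reconciled, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes 2-D embeddings, assumes that neighbouring subgraphs share a few anchor vertices, and swaps in a plain least-squares rotation to align one space with another. All names (`rotation_angle`, `reconcile`, the toy vertices) are hypothetical.

```python
import math
from functools import reduce

def rotation_angle(reference, local, anchors):
    """Angle that best rotates the local anchor vectors onto the reference ones
    (closed-form 2-D least-squares rotation; an assumed stand-in technique)."""
    cross = sum(local[v][0] * reference[v][1] - local[v][1] * reference[v][0]
                for v in anchors)
    dot = sum(local[v][0] * reference[v][0] + local[v][1] * reference[v][1]
              for v in anchors)
    return math.atan2(cross, dot)

def rotate(vec, theta):
    """Rotate a 2-D vector counter-clockwise by theta radians."""
    x, y = vec
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))

def reconcile(reference, local):
    """Merge a subgraph embedding into the reference space.
    Vertices present in both embeddings serve as anchors (assumption)."""
    anchors = sorted(set(reference) & set(local))
    theta = rotation_angle(reference, local, anchors)
    merged = dict(reference)
    for vertex, vec in local.items():
        if vertex not in merged:
            merged[vertex] = rotate(vec, theta)
    return merged

# Toy embeddings learned "in isolation" for two overlapping subgraphs.
# Subgraph B's space is subgraph A's space rotated by 90 degrees.
emb_a = {"u": (1.0, 0.0), "v": (0.0, 1.0), "w": (1.0, 1.0)}
emb_b = {"u": (0.0, 1.0), "v": (-1.0, 0.0), "x": (-2.0, 1.0)}

# Reduce step: fold the per-subgraph embeddings into one shared space.
merged = reduce(reconcile, [emb_a, emb_b])
print(merged["x"])
```

Because each subgraph is embedded independently, a failed worker would only require recomputing its own subgraph in a setup like this; the pairwise `reconcile` fold is the only place where results from different subgraphs meet.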

Related resources

MILE: A Multi-Level Framework for Scalable Graph Embedding

Recently there has been a surge of interest in designing graph embedding methods. Few, if any, can scale to a large-sized graph with millions of nodes due to both computational complexity and memory requirements. In this paper, we relax this limitation by introducing the MultI-Level Embedding (MILE) framework – a generic methodology allowing contemporary graph embedding methods to scale to larg...

Graph Embedding with Constraints

Recently, graph-based dimensionality reduction has received a lot of interest in many fields of information processing. Central to it is a graph structure which models the geometrical and discriminant structure of the data manifold. When label information is available, it is usually incorporated into the graph structure by modifying the weights between data points. In this paper, we propose a n...

Balanced Graph Partitioning with Apache Spark

A significant part of the data produced every day by online services is structured as a graph. Therefore, there is a need for efficient processing and analysis solutions for large-scale graphs. Among others, balanced graph partitioning is a well-known NP-complete problem with a wide range of applications. Several solutions have been proposed so far, however most of the existing state-...

Graph Clustering with Dynamic Embedding

Graph clustering (or community detection) has long drawn enormous attention from the research on web mining and information networks. Recent literature on this topic has reached a consensus that node contents and link structures should be integrated for reliable graph clustering, especially in an unsupervised setting. However, existing methods based on shallow models often suffer from content nois...

Journal

Journal title: Proceedings of the VLDB Endowment

Year: 2021

ISSN: 2150-8097

DOI: https://doi.org/10.14778/3503585.3503599